Bayesian Modeling of Lexical Resources for Low-Resource Settings
Authors
Abstract
Lexical resources such as dictionaries and gazetteers are often used as auxiliary data for tasks such as part-of-speech induction and named-entity recognition. However, discriminative training with lexical features requires annotated data to reliably estimate the lexical feature weights and may result in overfitting the lexical features at the expense of features which generalize better. In this paper, we investigate a more robust approach: we stipulate that the lexicon is the result of an assumed generative process. Practically, this means that we may treat the lexical resources as observations under the proposed generative model. The lexical resources provide training data for the generative model without requiring separate data to estimate lexical feature weights. We evaluate the proposed approach in two settings: part-of-speech induction and low-resource named-entity recognition.
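To make the idea of "lexicon entries as observations" concrete, here is a minimal Python sketch. It is a toy illustration and not the model proposed in the paper: it assumes each tag emits words from a categorical distribution with a symmetric Dirichlet prior, and it counts every gazetteer entry as one observed draw from its tag. The function names, the tag set, the example words, and the value of `alpha` are all hypothetical.

```python
from collections import Counter


def emission_posterior(lexicon, vocab, alpha=0.1):
    """Posterior predictive p(word | tag) under a symmetric Dirichlet prior.

    Assumption for illustration: each lexicon entry counts as one
    observed draw from its tag's word distribution.
    """
    probs = {}
    for tag, entries in lexicon.items():
        counts = Counter(entries)
        total = sum(counts.values()) + alpha * len(vocab)
        probs[tag] = {w: (counts[w] + alpha) / total for w in vocab}
    return probs


def tag_posterior(word, probs):
    """p(tag | word) by Bayes' rule with a uniform prior over tags."""
    scores = {tag: dist[word] for tag, dist in probs.items()}
    z = sum(scores.values())
    return {tag: s / z for tag, s in scores.items()}


# Hypothetical toy gazetteer: no annotated sentences are needed.
lexicon = {"PER": ["alice", "bob"], "LOC": ["paris", "london"]}
vocab = sorted({"alice", "bob", "paris", "london", "visited"})

probs = emission_posterior(lexicon, vocab)
post_paris = tag_posterior("paris", probs)      # strongly favors LOC
post_visited = tag_posterior("visited", probs)  # unlisted word: stays uniform
```

In this sketch the lexicon shifts the emission distributions directly, so no lexical feature weights have to be estimated from annotated data, and a word that appears in no lexicon keeps an uninformative tag posterior.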
Similar resources
The Continued Utility and Viability of Dakin’s Solution in Both High- and Low-resource Settings
Healthcare is expensive and often inaccessible to many. As a result, surgeons must consider simple, less expensive interventions when possible. For wound care, an older but quite effective cleaning agent is Dakin’s solution (0.5% sodium hypochlorite), an easily made mixture of 100 milliliters (ml) bleach with 8 teaspoons (tsp) baking soda into a gallon of clean water or 25 ml ble...
Modeling and solving multi-skilled resource-constrained project scheduling problem with calendars in fuzzy condition
In this study, we present a new model for the resource-constrained project scheduling problem (RCPSP) that considers a working calendar for project members and determines the skill factor of each member using the efficiency concept. For this purpose, the recyclable resources are staff resources where any person with multiple skills can meet the required skills of activities in a given time. ...
A Bayesian decision model for drought management in rainfed wheat farms of North East Iran
Drought is a feature of climate that can occur in virtually all climates. Therefore, it is an inevitable global but site-specific phenomenon which requires tools to predict and strategies and options to cope with it. In this research, the ability and effectiveness of the Bayesian Decision Networks (BDNs) approach in decision-making and evaluating drought management options for rainfed wheat product...
Semantic and Bayesian Profiling Services for Textual Resource Retrieval
This paper presents an integrated approach to textual resource retrieval, which combines logical inference services with user profiles, in which a structured representation of the user interests is maintained. Learning is performed on documents which have been disambiguated by exploiting the WordNet lexical database, in an attempt to discover concepts describing user interests. The proposed app...
Automatic Story Segmentation using a Bayesian Decision Framework for Statistical Models of Lexical Chain Features
This paper presents a Bayesian decision framework that performs automatic story segmentation based on statistical modeling of one or more lexical chain features. Automatic story segmentation aims to locate the instances in time where a story ends and another begins. A lexical chain is formed by linking coherent lexical items chronologically. A story boundary is often associated with a significa...